Abstract

This paper examines information-theoretic questions regarding the difficulty of 
compressing data versus the difficulty of decompressing data and the role that infor- 
mation loss plays in this interaction. Finite-state compression and decompression 
are shown to be of equivalent difficulty, even when the decompressors are allowed 
to be lossy. 

Inspired by Kolmogorov complexity, this paper defines the optimal decompres- 
sion ratio achievable on an infinite sequence by finite-state decompressors (that 
is, finite-state transducers outputting the sequence in question). It is shown that 
the optimal compression ratio achievable on a sequence S by any information loss- 
less finite state compressor, known as the finite-state dimension of S, is equal to 
the optimal decompression ratio achievable on S by any finite-state decompressor. 
This result implies a new decompression characterization of finite-state dimension 
in terms of lossy finite-state transducers. 

1 Introduction 

This paper addresses the fundamental information-theoretic question: is the problem of 
compressing data to a short representation of the same difficulty as the problem of decom- 
pressing data from a short representation? It is known that for certain cases admitting 
sufficient computational resources, both problems are indeed of equivalent difficulty. For 
example, consider the case of polynomial-space-bounded Kolmogorov complexity.
The shortest program computing a string x in polynomial space can be computed from 
x in polynomial space, by reusing space to conduct an exponential time search for short 
polynomial space programs for x. However, this result is not known to hold at lower 
levels of complexity, such as polynomial-time-bounded Kolmogorov complexity. At the 
level of unbounded computation, there is a known incongruity between compression and 
decompression: a string is computable from its shortest program, but the converse does
not hold. 

This paper settles the case at a level of computational complexity lower even than 
polynomial time: the finite-state level. It was already known [9, 10] that, if attention
is restricted to information lossless (IL) finite-state transducers [9], compression and
decompression are of equivalent difficulty. Our main result shows that we need not restrict
attention to IL transducers to obtain this equivalence. Inspired by Kolmogorov complexity
[TT], we define the optimal decompression ratio achievable on an infinite sequence by
(possibly lossy) finite-state transducers acting as decompressors. Our result implies that 
this quantity is equal to the optimal compression ratio achievable on the sequence with 
IL finite-state compressors. 

More precisely, given an infinite sequence $S$, Ziv and Lempel [20] defined the finite-state
strong dimension of $S$ (called the finite-state compressibility of $S$ in [20]) to be

\[ \text{Dim}_{\text{FS}}(S) = \lim_{k \to \infty} \limsup_{n \to \infty} \frac{C^k_{\text{ILFS-LZ}}(S \upharpoonright n)}{n}, \tag{1.1} \]

where $C^k_{\text{ILFS-LZ}}(S \upharpoonright n)$ is the length of the shortest string output by any information lossless
finite-state transducer (ILFST) with at most $k$ states, when given $S \upharpoonright n$ as input. An
analogous quantity, the finite-state dimension $\dim_{\text{FS}}(S)$ of $S$ [Sid], is defined similarly,
by replacing the limit superior in (1.1) with a limit inferior. Finite-state dimension and
strong dimension are so called because they have been shown [3, 1] to be finite-state
effectivizations, respectively, of classical Hausdorff dimension [5] and packing dimension 
[TJ2 [TT] , the two most widely-used fractal dimensions. Each admits a host of different 
characterizations, in terms of finite-state gamblers [21 [I] , entropy rates [20, 2], information
lossless finite-state compressors [201 El EE], and finite-state log-loss predictors [7]. This
indicates that finite-state dimension is a robust and stable quantity that truly measures the 
information density of a sequence as perceived by finite-state machines, to a certain extent 
independent of the details of the particular finite-state machine model under consideration. 

An ILFST is a finite-state transducer (FST) that must create an output from which the 
input can be uniquely recovered, whereas a general FST has no such restriction. An ILFST 
T therefore cannot output small strings on most inputs, which limits which strings T can 
significantly compress. By contrast, the quantity $C^k_{\text{FS}}(x)$, defined similarly to $C^k_{\text{ILFS-LZ}}(x)$,
but without the IL requirement, is trivially equal to 0 for all strings $x$, because a 1-state
FST that always outputs the empty string compresses every string to length 0. This FST 
"cheats" by throwing away information contained in its input. Requiring the FST to be 
IL prevents this cheating and limits the compression performance of the FST. From this 
perspective, we consider the following two questions. 

1. Does the characterization of finite-state dimension still hold if we consider decom- 
pressors instead of compressors? That is, suppose the FST, rather than aiming 
to compress the given sequence to a more compact sequence, is instead aiming to 
expand a compact sequence into the given sequence. 
2. If the answer to question 1 is yes, is it mandatory that the decompressor be IL
in order to characterize finite-state dimension? In other words, would allowing a 
decompressor to be an arbitrary FST afford it more power to decompress than if it 
were IL, as in the case of compression? 

An affirmative answer to question 1 follows in a straightforward manner from the well-known
result [9, 10] that every ILFST computes a function whose inverse is computable by
another ILFST (in a technical sense described in Theorem 3.3). The answer to question
2 is less obvious. There are clearly functions that are computable by a FST but not
computable by any ILFST (for example, any constant function). Informally, question 2
asks, can a FST acting as a decompressor improve its performance, i.e., output a larger
string than otherwise possible, by throwing away information?

The main result of this paper answers question 2 negatively. We show that given a
lossy FST $T$, there is an ILFST $T'$ with the property that, for all strings $x$, the shortest
input to $T'$ that outputs $x$ is no larger than the shortest input to $T$ that outputs $x$.
Therefore, while T' cannot do everything that T can do, it can decompress as effectively 
as T. The intuitive reason this is possible is that, although T is lossy, optimally compressed 
input to T follows an "information lossless path" through T. We construct T' to preserve 
such IL paths, while amending only the "lossy paths" through T in order to make it IL. 

This result implies that the finite-state dimension of a sequence can be characterized
in terms of the optimal decompression ratio achieved on the sequence by any finite-state
decompressor. More precisely, define $D^k_{\text{FS}}(x)$ to be the length of the smallest string that
produces $x$ as output, when given as input to some FST that requires at most $k$ bits to
describe in a standard binary representation of FST's.¹ We show that the finite-state
strong dimension of a sequence $S$ can be characterized by replacing $C^k_{\text{ILFS-LZ}}$ with $D^k_{\text{FS}}$ in
(1.1) (and analogously for finite-state dimension).

One interpretation of $D^k_{\text{FS}}$ is as a finite-state adaptation of Kolmogorov complexity,
with Kolmogorov complexity considered to measure "optimal decompression" at the level 
of unbounded computation. From this perspective, our finite-state dimension character- 
ization mirrors previously known characterizations of other effective dimensions, such as 
constructive dimension [131 Ej? computable dimension, and various space-bounded di- 
mensions such as polynomial-space dimension [T2J, Ej, in terms of Kolmogorov complexity 
or space-bounded Kolmogorov complexity. It remains an open question whether
polynomial-time dimension [12] can be characterized in terms of polynomial-time
Kolmogorov complexity (see [8] for a summary of recent progress on this question).

After writing this paper, the authors became aware of a very similar result proven by 
Lempel, Sheinwald, and Ziv [16]. Therefore, our proof of Theorem 3.10 may be considered
a new proof of Corollary 2.3 of [16].

¹ Unlike in the quantity $C^k_{\text{ILFS-LZ}}$ used by Ziv and Lempel, the $k$ in $D^k_{\text{FS}}$ does not represent the number
of states of the FST, but rather its total description length. This discrepancy is explained in Section 3.
2 Preliminaries



2.1 Notation 

Throughout this paper, $\Sigma$ is a finite alphabet. $\mathbb{N}$ is the set of all nonnegative integers. All
strings are elements of $\Sigma^*$, and all sequences are elements of $\Sigma^\infty$. For all $x \in \Sigma^*$, we write
$|x|$ to denote the length of $x$. For all $k \in \mathbb{N}$, $\Sigma^k$, $\Sigma^{\le k}$, and $\Sigma^{<k}$ are the sets of strings of
length exactly $k$, at most $k$, and less than $k$, respectively. $\lambda$ denotes the empty string. If
$x$ is a string or sequence and $i, j$ are integers, $x[i..j]$ denotes the string consisting of the
$i$th through $j$th symbols in $x$, with $x[i..j] = \lambda$ if $j < i$, noting that $x[0]$ is the leftmost
symbol in $x$, and we write $x \upharpoonright n$ to denote $x[0..n-1]$. If $w$ is a string and $x$ is a string or
sequence, we say $w$ is a prefix of $x$, and we write $w \sqsubseteq x$, if $x = wu$ for some string or sequence $u$, and
we write $w \sqsubset x$ if $w \sqsubseteq x$ and $w \ne x$. We say $w$ is a suffix of $x$ if $x = uw$ for some $u \in \Sigma^*$,
and we say $w$ is a proper suffix of $x$ if $w$ is a suffix of $x$ and $w \ne x$. For a set $X \subseteq \Sigma^*$, we
say $X$ is suffix-free if, for all $x, y \in X$, $x$ is not a proper suffix of $y$.

2.2 Finite-State Compression 

In this section, we develop a notion of finite-state compression and decompression that 
serves to measure the optimal amount by which strings and sequences can be compressed 
and decompressed by finite-state transducers. We base our model of finite-state trans- 
ducers on that studied in [3], which was introduced in a similar form by Shannon [15] 
and investigated by Huffman [9] and Ziv and Lempel [19] . Kohavi [10] gives an extensive 
treatment of the subject. 

A finite-state transducer (FST) is a 4-tuple

$T = (Q, \delta, \nu, q_0)$,

where

• $Q$ is a nonempty, finite set of states,

• $\delta : Q \times \Sigma \to Q$ is the transition function,

• $\nu : Q \times \Sigma \to \Sigma^*$ is the output function,

• $q_0 \in Q$ is the initial state.

Furthermore, we assume that every state in $Q$ is reachable from $q_0$. Given $q_1, q_2 \in Q$
and $a \in \Sigma$ such that $\delta(q_1, a) = q_2$, we refer to the triple $(q_1, a, q_2)$ as a transition arrow
in the directed graph representing the FST, in order to emphasize where the arrow starts
and ends, and what input symbol causes the FST to follow it. By this interpretation,
if $a \ne a'$ but $q_2 = \delta(q_1, a) = \delta(q_1, a')$, then $(q_1, a, q_2)$ and $(q_1, a', q_2)$ constitute different
transition arrows, even though they start and end at the same states.
For all $x \in \Sigma^*$ and $a \in \Sigma$, define the extended transition function $\hat{\delta} : \Sigma^* \to Q$ by the
recursion

\[ \hat{\delta}(\lambda) = q_0, \qquad \hat{\delta}(xa) = \delta(\hat{\delta}(x), a). \]

For $x \in \Sigma^*$, we define the output of $T$ on $x$ to be the string $T(x)$ defined by the recursion

\[ T(\lambda) = \lambda, \qquad T(xa) = T(x)\,\nu(\hat{\delta}(x), a) \]

for all $x \in \Sigma^*$ and $a \in \Sigma$. Given any FST $T$, we say $\pi \in \Sigma^*$ is a minimal program for $T$ if,
for all $\pi' \in \Sigma^{<|\pi|}$, $T(\pi) \ne T(\pi')$; i.e., $\pi$ is a shortest input to $T$ that produces the output $T(\pi)$.
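The two recursions and the notion of a shortest input can be made concrete with a small sketch. It is illustrative only: the class name `FST`, the helper `shortest_program`, the dictionary encoding of the transition and output functions, and the one-state `doubler` transducer are all assumptions introduced here, not part of the paper.

```python
from itertools import product

class FST:
    """Finite-state transducer T = (Q, delta, nu, q0); Q is implicit in the dict keys."""

    def __init__(self, alphabet, delta, nu, q0):
        self.alphabet = alphabet  # iterable of single-character symbols
        self.delta = delta        # dict: (state, symbol) -> state
        self.nu = nu              # dict: (state, symbol) -> output string
        self.q0 = q0

    def run(self, x):
        """Return (T(x), final state) by unrolling the two recursions symbol by symbol."""
        state, out = self.q0, []
        for a in x:
            out.append(self.nu[(state, a)])
            state = self.delta[(state, a)]
        return "".join(out), state

def shortest_program(T, target, max_len):
    """Brute-force search for a shortest pi with T(pi) == target (None if none up to max_len)."""
    for n in range(max_len + 1):
        for pi in product(T.alphabet, repeat=n):
            if T.run(pi)[0] == target:
                return "".join(pi)
    return None

# Hypothetical one-state FST that writes every input bit twice.
doubler = FST("01",
              {("q", "0"): "q", ("q", "1"): "q"},
              {("q", "0"): "00", ("q", "1"): "11"},
              "q")
```

Because the search proceeds by increasing length, any program it returns is a shortest input producing the target, and hence a minimal program in the sense above; the search is exponential in `max_len` and only meant for toy examples.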

A FST can trivially act as an "optimal compressor" by outputting $\lambda$ on every transition
arrow, but this is, of course, a useless compressor, because the input cannot be recovered.
A FST $T = (Q, \delta, \nu, q_0)$ is information lossless (IL) if the function $x \mapsto (T(x), \hat{\delta}(x))$ is
one-to-one; i.e., if the output and final state of $T$ on input $x$ uniquely identify $x$. An
information lossless finite-state transducer (ILFST) is a FST that is IL. We write FST to
denote the set of all finite-state transducers, and we write ILFST to denote the set of all
information lossless finite-state transducers.
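As a rough illustration of the IL condition, the sketch below searches for two distinct inputs that yield the same (output, final state) pair. The function names and the dictionary encoding are assumptions made for this example. Finding a collision proves that the transducer is lossy; finding none up to `max_len` proves nothing beyond that bound.

```python
from itertools import product

def run(delta, nu, q0, x):
    """Return (output string, final state) of the transducer on input x."""
    state, out = q0, ""
    for a in x:
        out += nu[(state, a)]
        state = delta[(state, a)]
    return out, state

def find_collision(alphabet, delta, nu, q0, max_len):
    """Return two distinct inputs of length <= max_len with equal (output, state), or None."""
    seen = {}
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            x = "".join(tup)
            key = run(delta, nu, q0, x)
            if key in seen:
                return seen[key], x
            seen[key] = x
    return None

loop = {("q", "0"): "q", ("q", "1"): "q"}
eraser_nu = {("q", "0"): "", ("q", "1"): ""}      # always outputs the empty string
identity_nu = {("q", "0"): "0", ("q", "1"): "1"}  # copies its input
```

The one-state eraser is the "cheating" compressor described above: it collides already on the empty input and the input `0`.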

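Let $S \in \Sigma^\infty$. The finite-state dimension [3] and the finite-state strong dimension [1]
of $S$ are respectively defined

\[ \dim_{\text{FS}}(S) = \inf_{T \in \text{ILFST}} \liminf_{n \to \infty} \frac{|T(S \upharpoonright n)|}{n} \]

and

\[ \text{Dim}_{\text{FS}}(S) = \inf_{T \in \text{ILFST}} \limsup_{n \to \infty} \frac{|T(S \upharpoonright n)|}{n}. \]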

Intuitively, the finite-state dimension (resp. strong dimension) of a sequence represents 
the optimal best-case (resp. worst-case) compression ratio achievable on the sequence 
with any information lossless finite-state compressor. (This is a different definition of 
finite-state dimension than that given in the Introduction; Lemma 3.1 tells us that they
are in fact equivalent.) 

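Fix some standard binary representation $\sigma_T \in \{0,1\}^*$ of each FST $T$, and define
$|T| = |\sigma_T|$. For all $k \in \mathbb{N}$, define

\[ \text{FST}^{\le k} = \{ T \in \text{FST} \mid |T| \le k \}, \]
\[ \text{ILFST}^{\le k} = \{ T \in \text{ILFST} \mid |T| \le k \}, \]
\[ \text{ILFST}^{\le k\text{-state}} = \{ T = (Q, \delta, \nu, q_0) \in \text{ILFST} \mid |Q| \le k \}. \]

Note that, for all $k \in \mathbb{N}$, $\text{ILFST}^{\le k} \subseteq \text{ILFST}^{\le k\text{-state}}$ and $\text{ILFST}^{\le k} \subseteq \text{FST}^{\le k}$.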
We next define quantities that may be considered parameterized finite-state analogs of
Kolmogorov complexity. For all $k \in \mathbb{N}$ and $x \in \Sigma^*$, define the $k$-finite-state decompression
complexity of $x$ by

\[ D^k_{\text{FS}}(x) = \min \{ |\pi| \mid (\exists T \in \text{FST}^{\le k})\ T(\pi) = x \}, \]

the $k$-IL-finite-state decompression complexity of $x$ by

\[ D^k_{\text{ILFS}}(x) = \min \{ |\pi| \mid (\exists T \in \text{ILFST}^{\le k})\ T(\pi) = x \}, \]

the $k$-IL-finite-state compression complexity of $x$ by

\[ C^k_{\text{ILFS}}(x) = \min \{ |\pi| \mid (\exists T \in \text{ILFST}^{\le k})\ T(x) = \pi \}, \]

and the $k$-IL-finite-state Lempel-Ziv compression complexity [20] of $x$ by

\[ C^k_{\text{ILFS-LZ}}(x) = \min \{ |\pi| \mid (\exists T \in \text{ILFST}^{\le k\text{-state}})\ T(x) = \pi \}. \]
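To see what a decompression complexity measures, here is a minimal sketch that computes the shortest input length over an explicitly listed family of transducers. This is a simplification: the paper's $\text{FST}^{\le k}$ is the set of all FSTs with description length at most $k$ under a fixed binary encoding, which the sketch does not model. The function names, the tuple encoding `(delta, nu, q0)`, and the two sample transducers are assumptions.

```python
from itertools import product

def output(fst, x):
    """Output string of fst = (delta, nu, q0) on input x."""
    delta, nu, state = fst
    out = ""
    for a in x:
        out += nu[(state, a)]
        state = delta[(state, a)]
    return out

def decompression_complexity(family, alphabet, x, max_len):
    """Least |pi| <= max_len such that some FST in family maps pi to x, else None."""
    for n in range(max_len + 1):
        for pi in product(alphabet, repeat=n):
            if any(output(fst, pi) == x for fst in family):
                return n
    return None

loop = {("q", "0"): "q", ("q", "1"): "q"}
identity = (loop, {("q", "0"): "0", ("q", "1"): "1"}, "q")
doubler = (loop, {("q", "0"): "00", ("q", "1"): "11"}, "q")
```

Enlarging the family can only lower the value, which mirrors the monotonicity of $D^k_{\text{FS}}$ in $k$.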



3 Information Loss and Finite-State Decompression 

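The following lemma is due to Athreya, Hitchcock, Lutz, and Mayordomo [1].

Lemma 3.1. For all $S \in \Sigma^\infty$,

\[ \lim_{k \to \infty} \liminf_{n \to \infty} \frac{C^k_{\text{ILFS-LZ}}(S \upharpoonright n)}{n} = \inf_{T \in \text{ILFST}} \liminf_{n \to \infty} \frac{|T(S \upharpoonright n)|}{n} \quad (= \dim_{\text{FS}}(S)), \]

and

\[ \lim_{k \to \infty} \limsup_{n \to \infty} \frac{C^k_{\text{ILFS-LZ}}(S \upharpoonright n)}{n} = \inf_{T \in \text{ILFST}} \limsup_{n \to \infty} \frac{|T(S \upharpoonright n)|}{n} \quad (= \text{Dim}_{\text{FS}}(S)). \]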



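For all $k \in \mathbb{N}$ and all $T \in \text{ILFST}^{\le k}$, $C^k_{\text{ILFS-LZ}}(S \upharpoonright n) \le C^k_{\text{ILFS}}(S \upharpoonright n) \le |T(S \upharpoonright n)|$. By
Lemma 3.1, we arrive at the following characterization of finite-state dimension.

Observation 3.2. For all $S \in \Sigma^\infty$,

\[ \dim_{\text{FS}}(S) = \lim_{k \to \infty} \liminf_{n \to \infty} \frac{C^k_{\text{ILFS}}(S \upharpoonright n)}{n} \]

and

\[ \text{Dim}_{\text{FS}}(S) = \lim_{k \to \infty} \limsup_{n \to \infty} \frac{C^k_{\text{ILFS}}(S \upharpoonright n)}{n}. \]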
We choose this characterization of finite-state dimension to investigate the relationship
between compression and decompression because, in contrast to $C^k_{\text{ILFS-LZ}}$, the decompression
complexity measures $D^k_{\text{FS}}$ and $D^k_{\text{ILFS}}$ would become trivial if the transducers in $\text{FST}^{\le k}$
and $\text{ILFST}^{\le k}$ were limited only to those FST's with at most $k$ states: for each $x \in \Sigma^*$, a
1-state FST with $x$ on a transition arrow would suffice to produce $x$ from a single input
symbol. Therefore, we limit the total description length of the transducer when considering
$\text{FST}^{\le k}$ and $\text{ILFST}^{\le k}$, in order to account for both the number of states and the size
of the output strings.

The following well-known theorem [9, 10] states that the function from $\Sigma^*$ to $\Sigma^*$
computed by an ILFST can be inverted, in an approximate sense, by another ILFST.

Theorem 3.3. For any ILFST $T$, there exists an ILFST $T^{-1}$ and a constant $c \in \mathbb{N}$ such
that, for all $x \in \Sigma^*$, $x \upharpoonright (|x| - c) \sqsubseteq T^{-1}(T(x)) \sqsubseteq x$.

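The following lemma shows that, due to Theorem 3.3, finite-state dimension can be
characterized in terms of optimal decompression by ILFST's.

Lemma 3.4. For all $S \in \Sigma^\infty$,

\[ \dim_{\text{FS}}(S) = \lim_{k \to \infty} \liminf_{n \to \infty} \frac{D^k_{\text{ILFS}}(S \upharpoonright n)}{n}, \]

and

\[ \text{Dim}_{\text{FS}}(S) = \lim_{k \to \infty} \limsup_{n \to \infty} \frac{D^k_{\text{ILFS}}(S \upharpoonright n)}{n}. \]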



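Proof. We prove the result for $\dim_{\text{FS}}$. The proof for $\text{Dim}_{\text{FS}}$ is analogous.

To show $\dim_{\text{FS}}(S) \ge \lim_{k \to \infty} \liminf_{n \to \infty} D^k_{\text{ILFS}}(S \upharpoonright n)/n$, let $d > d' > \dim_{\text{FS}}(S)$, and let
$\epsilon = 1 - d'/d > 0$. By our choice of $d'$, there exists $k \in \mathbb{N}$ and $C \in \text{ILFST}^{\le k}$ such that for infinitely
many $n \in \mathbb{N}$, $|C(S \upharpoonright n)| < d'n$. Let $D = C^{-1}$ and $c \in \mathbb{N}$ be given by Theorem 3.3.
Thus $D \in \text{ILFST}^{\le k'}$ for some $k'$, and for every $n \in \mathbb{N}$, $D(C(S \upharpoonright n)) = S \upharpoonright m_n$ where
$n - c \le m_n \le n$. If $p_n = C(S \upharpoonright n)$, then for infinitely many $n \in \mathbb{N}$ (taking $n \ge c/\epsilon$), $D(p_n) = S \upharpoonright m_n$
where

\[ \frac{|p_n|}{m_n} \le \frac{|p_n|}{n - c} \le \frac{|p_n|}{n - \epsilon n} \le \frac{|p_n|}{n(1 - \epsilon)} < \frac{d'}{1 - \epsilon} = d, \]

whence $\dim_{\text{FS}}(S) \ge \lim_{k \to \infty} \liminf_{n \to \infty} D^k_{\text{ILFS}}(S \upharpoonright n)/n$.

To show $\dim_{\text{FS}}(S) \le \lim_{k \to \infty} \liminf_{n \to \infty} D^k_{\text{ILFS}}(S \upharpoonright n)/n$, let $d > \lim_{k \to \infty} \liminf_{n \to \infty} D^k_{\text{ILFS}}(S \upharpoonright n)/n$. By choice of
$d$, there exists $k \in \mathbb{N}$ and $D \in \text{ILFST}^{\le k}$ such that for infinitely many $n \in \mathbb{N}$, there exists $p_n \in \Sigma^*$
such that $D(p_n) = S \upharpoonright n$ and $|p_n| < dn$. Let $C = D^{-1}$ and $c \in \mathbb{N}$ be given by Theorem
3.3. Thus $C \in \text{ILFST}^{\le k'}$ for some $k'$, and for every such $n$, $C(D(p_n)) = p'_n$ where $p'_n \sqsubseteq p_n$.
Hence for infinitely many $n \in \mathbb{N}$,

\[ \frac{|C(S \upharpoonright n)|}{n} = \frac{|C(D(p_n))|}{n} = \frac{|p'_n|}{n} \le \frac{|p_n|}{n} < d, \]

whence $\dim_{\text{FS}}(S) \le \lim_{k \to \infty} \liminf_{n \to \infty} D^k_{\text{ILFS}}(S \upharpoonright n)/n$. □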

Let $T = (Q, \delta, \nu, q_0)$ be a FST. Define a path in $T$ to be a finite sequence $p =
(p_0, a_0, p_1, a_1, \ldots, p_{n-1}, a_{n-1}, p_n)$, where $p_i \in Q$ and $a_i \in \Sigma$, satisfying, for all $0 \le i \le n - 1$,
$\delta(p_i, a_i) = p_{i+1}$. Let $|p| = n$ denote the length of $p$, the number of transition arrows it
follows. For $0 \le i \le n$, define

\[ p \upharpoonright i = (p_0, a_0, p_1, a_1, \ldots, p_{i-1}, a_{i-1}, p_i), \]

with $p \upharpoonright 0 = (p_0)$. Let

\[ \nu(p) = \nu(p_0, a_0)\,\nu(p_1, a_1) \cdots \nu(p_{n-1}, a_{n-1}) \]

denote the output of the path $p$, with $\nu(p) = \lambda$ if $|p| = 0$ (i.e., if $p = (p_0)$ for some $p_0 \in Q$).
Define a path $c = (p_0, \ldots, p_n)$ to be a cycle in $T$ if $|c| > 0$ and $p_0 = p_n$. Given a cycle $c$,
we say that $c$ is a $\lambda$-cycle if $\nu(c) = \lambda$.
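Since $\nu(c) = \lambda$ holds exactly when every transition arrow on $c$ outputs the empty string, the existence of a $\lambda$-cycle reduces to cycle detection in the subgraph of $\lambda$-output arrows. The following sketch uses a standard three-color depth-first search; the function name and the dictionary encoding of the transition and output functions are assumptions made for this example.

```python
def has_lambda_cycle(states, alphabet, delta, nu):
    """True iff some cycle uses only transition arrows whose output is the empty string."""
    # Successor lists restricted to the lambda-output arrows.
    succ = {q: [delta[(q, a)] for a in alphabet if nu[(q, a)] == ""] for q in states}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {q: WHITE for q in states}

    def visit(q):
        color[q] = GRAY
        for r in succ[q]:
            # A GRAY successor is on the current DFS stack, so a cycle closes here.
            if color[r] == GRAY or (color[r] == WHITE and visit(r)):
                return True
        color[q] = BLACK
        return False

    return any(color[q] == WHITE and visit(q) for q in states)

# Hypothetical two-state example: reading 0 bounces between s and t silently.
delta = {("s", "0"): "t", ("t", "0"): "s", ("s", "1"): "s", ("t", "1"): "t"}
silent = {("s", "0"): "", ("t", "0"): "", ("s", "1"): "1", ("t", "1"): "1"}
noisy = {("s", "0"): "", ("t", "0"): "0", ("s", "1"): "1", ("t", "1"): "1"}
```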

Let $T = (Q, \delta, \nu, q_0)$ be a FST, and let $s, f \in Q$. If two unequal paths $p = (s, \ldots, f)$
and $q = (s, \ldots, f)$ from $s$ to $f$ satisfy $\nu(p) = \nu(q)$, we call the pair $(p, q)$ a bad pair (for
$(s, f)$). The following property of FST's is well-known [9, 10].

Lemma 3.5. A FST is IL if and only if it contains no bad pairs.

Given a path

\[ p = (p_1, a_1, \ldots, p_{n-1}, a_{n-1}, p_n), \]

define a (proper) 1-step subpath

\[ p' = (p'_1, a'_1, \ldots, p'_{m-1}, a'_{m-1}, p'_m) \]

of $p$, written $p' \prec_1 p$, to be a path satisfying $m < n$ and one of the following conditions:

1. $p'$ is a proper prefix of $p$: for all $1 \le i \le m$, $p'_i = p_i$, and $a'_i = a_i$ when $i < m$.

2. $p'$ is a proper suffix of $p$: for all $1 \le i \le m$, $p'_i = p_{i+n-m}$, and $a'_i = a_{i+n-m}$ when
$i < m$.

3. $p'$ is a cycle-reduced subpath of $p$. This means that there exists a cycle $c$ in $p$ such
that removing $c$ from $p$ results in $p'$. For example, if $p$ has a cycle as follows:

\[ p = (p_1, a_1, \ldots, \underbrace{p_i, a_i, \ldots, p_j, a_j, p_h}_{\text{cycle}}, b_h, \ldots, p_n), \]

then by removing this cycle, $p$ gives rise to the 1-step subpath

\[ p' = (p_1, a_1, \ldots, p_i, b_h, \ldots, p_n). \]
[Figure 3.1 state diagrams not reproduced. Panels: (a) FST with a non-simple bad pair due to
cycles; (b) FST with a non-simple bad pair due to overlapping prefixes; (c) FST with a
non-simple bad pair due to overlapping prefixes; (d) FST with only a simple bad pair.]

Figure 3.1: Examples of three non-simple bad pairs and a simple bad pair. Figure
3.1(a) shows an FST in which the only simple bad pair for $(s, f)$ is the pair of paths
$p = (s, 0, p_1, 0, f)$ and $q = (s, 1, q_1, 0, f)$. Other bad pairs may be constructed from $p$ and $q$
by adding cycles (e.g., $p' = (s, 0, p_1, 0, f)$ and $q' = (s, 1, q_1, 1, q_1, 1, q_1, 0, f)$), but
since these can be changed into the bad pair $(p, q)$ by removing the cycles, they are not
simple bad pairs. In Figure 3.1(b), there is no simple bad pair for $(s, f)$, although there
is a bad pair consisting of the paths $p = (s, 0, s', 0, p_1, 0, f)$ and $q = (s, 0, s', 1, q_1, 0, f)$,
which, by removing the prefix $(s, 0)$ from $p$ and $q$, forms a simple bad pair for $(s', f)$.
Similarly, in Figure 3.1(c), there is a bad pair, but no simple bad pair, for $(s, f)$, although
there is a simple bad pair for $(s, f')$. Finally, in Figure 3.1(d), the only bad pair for $(s, f)$,
which is the pair of paths $p = (s, 0, p_1, 0, f)$ and $q = (s, 1, q_1, 0, f)$, is also simple.
Let $\preceq\ =\ \prec_1^*$ denote the reflexive, transitive closure of $\prec_1$. We say $p'$ is a subpath of
$p$ if $p' \preceq p$. We say $p$ is a proper subpath of $q$ if $p \preceq q$ and $p \ne q$. Let $s, f \in Q$, and let
$(p, q)$ be a bad pair for $(s, f)$. We say $(p, q)$ is a simple bad pair for $(s, f)$ (a.k.a., $(p, q)$ is
simple) if, for all $p' \preceq p$ and $q' \preceq q$, $(p', q')$ is a bad pair if and only if $(p', q') = (p, q)$.

Note that, if a $\lambda$-cycle is removed from a path, then the subpath's output is the same
as that of the path, leading to the following observation.

Observation 3.6. If $(p, q)$ is a simple bad pair, then neither $p$ nor $q$ contains a $\lambda$-cycle.

Intuitively, the paths of a simple bad pair cannot be shrunken through removal of 
prefixes, suffixes, or cycles, while remaining a bad pair. See Figure 3.1 for an example of
three types of bad pairs that are not simple, and one bad pair that is simple. A simple 
bad pair (p, q) is "canonical" in the sense that the bad pairs formed by its superpaths are 
bad only because (p, q) is bad, and if (p, q) could be "fixed" somehow, then the bad pairs 
formed by the superpaths of p and q would be fixed as well. This intuition is reinforced 
by the following lemma. 

Lemma 3.7. A FST is IL if and only if it contains no simple bad pairs. 

Proof. Let $T$ be a FST. By Lemma 3.5, $T$ is IL if and only if it contains no bad pairs.
Since every simple bad pair is a bad pair, it suffices to show that, if $T$ is not IL, then it
contains a simple bad pair.

Assume that $T$ is not IL. Then by Lemma 3.5, $T$ contains a bad pair $(p, q)$. If $(p, q)$
is simple, then the proof is complete. Otherwise, $(p, q)$ is a non-simple bad pair, which
means that there exist subpaths $p' \preceq p$ and $q' \preceq q$, at least one of them proper, such that
$(p', q')$ is a bad pair. Note that $|p'| + |q'| < |p| + |q|$, since at least one of $p'$ or $q'$ is a
proper subpath. Therefore, if $(p', q')$ is not a simple bad pair, we can repeat this process
to produce another bad pair $(p'', q'')$ with $|p''| + |q''| < |p'| + |q'|$. However, this sum must
be positive for any bad pair. Therefore, the process must eventually terminate with a
simple bad pair. □

Lemma 3.8. Let $T = (Q, \delta, \nu, q_0)$ be a FST, let $f \in Q$, and let

\[ X = \{ \nu(f', a) \mid f' \in Q,\ a \in \Sigma,\ \delta(f', a) = f \} \]

be the set of output strings on transition arrows entering $f$. If $X$ is suffix-free and the
total number of transition arrows entering $f$ is $|X|$ (i.e., if every such transition arrow
has a unique output string), then $f$ is not the final state of any simple bad pair.

Proof. Let $T$, $f$, and $X$ be as in the statement of the lemma. Then by the definition of
a simple bad pair, for any simple bad pair $(p, q)$ ending in $f$, the final transition arrows
$(f'_p, a_p, f)$ and $(f'_q, a_q, f)$ must be different. Otherwise, $p$ and $q$ could have their last
transition removed and remain a bad pair, and $(p, q)$ would not be simple. But since $X$
is suffix-free, $\nu(p) \ne \nu(q)$, so $(p, q)$ cannot be a simple bad pair. □
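The hypothesis of Lemma 3.8 is a purely local condition on the transition arrows entering a state, so it is easy to test mechanically. The sketch below is one way to do so; the function names and the dictionary encoding are assumptions made for this example, and the sample transducer is hypothetical.

```python
def is_suffix_free(strings):
    """True iff no string in the collection is a proper suffix of another."""
    items = set(strings)
    return not any(x != y and y.endswith(x) for x in items for y in items)

def satisfies_lemma_3_8(states, alphabet, delta, nu, f):
    """Check that the outputs on arrows entering f are pairwise distinct and suffix-free."""
    incoming = [nu[(q, a)] for q in states for a in alphabet if delta[(q, a)] == f]
    return len(set(incoming)) == len(incoming) and is_suffix_free(incoming)

# Hypothetical FST: s moves to e on 0 and loops on 1; e loops on both symbols.
delta = {("s", "0"): "e", ("s", "1"): "s", ("e", "0"): "e", ("e", "1"): "e"}
good_nu = {("s", "0"): "00", ("s", "1"): "1", ("e", "0"): "01", ("e", "1"): "10"}
bad_nu = {("s", "0"): "00", ("s", "1"): "1", ("e", "0"): "0", ("e", "1"): "10"}
```

Note that the empty string is a proper suffix of every nonempty string, so a set containing $\lambda$ and any other string is never suffix-free under this check.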

Lemma 3.9. Every FST has a finite number of simple bad pairs. 
Proof. Let $T = (Q, \delta, \nu, q_0)$ be a FST. Let $l = \max_{s \in Q, a \in \Sigma} |\nu(s, a)|$ be the length of the
longest output string on any transition arrow in $T$. Then for any path $p$ in $T$, $|\nu(p)| \le l|p|$.
Note that, if $p$ contains no $\lambda$-cycles, then $|p| \le |Q||\nu(p)|$. By Observation 3.6, if $(p, q)$ is
a simple bad pair, then neither $p$ nor $q$ contains a $\lambda$-cycle. Thus, for any simple bad pair
$(p, q)$, since $\nu(p) = \nu(q)$, $|p| \le |Q||\nu(p)| = |Q||\nu(q)| \le |q||Q|l$ and likewise, $|q| \le |p||Q|l$.
Therefore, for any $N \in \mathbb{N}$, there are only a finite number of simple bad pairs $(p, q)$ for
which $|p| < N$ or $|q| < N$. We will complete the proof by showing that any bad pair $(p, q)$
such that $|p|, |q| > l|Q|^3|\Sigma|^l$ cannot be simple.

Let $(p, q)$ be a bad pair such that $|p| > l|Q|^3|\Sigma|^l$ and $|q| > l|Q|^3|\Sigma|^l$. If $p \upharpoonright 1 = q \upharpoonright 1$,
then $(p, q)$ is not simple, so assume that $p \upharpoonright 1 \ne q \upharpoonright 1$. We proceed through the paths
$p$ and $q$ in stages in an attempt to "approximately synchronize" their outputs. For each
stage $n \in \mathbb{N}$, define the positions $i_n, j_n \in \mathbb{N}$ (with $i_0 < i_1 < \ldots$ and $j_0 < j_1 < \ldots$)
recursively as follows. $i_0 = j_0 = 0$. For all $n \in \mathbb{N}$, let $i_{n+1}$ be the smallest integer such
that $\nu(q \upharpoonright j_n) \sqsubset \nu(p \upharpoonright i_{n+1})$, and let $j_{n+1}$ be the smallest integer such that $\nu(p \upharpoonright i_{n+1}) \sqsubseteq
\nu(q \upharpoonright j_{n+1})$. $i_n$ and $j_n$ ensure that the output of path $q$ at stage $n$ is at least as long as
the output of path $p$ at stage $n$, but no longer than is necessary to ensure that this holds,
and the output of path $p$ at the stage $n + 1$ is just long enough to extend the output of
path $q$ at stage $n$.

For all stages $n \ge 0$, $0 < |\nu(p \upharpoonright i_{n+1})| - |\nu(p \upharpoonright i_n)| \le l$, $0 < |\nu(q \upharpoonright j_{n+1})| - |\nu(q \upharpoonright j_n)| \le l$,
and $0 \le |\nu(q \upharpoonright j_n)| - |\nu(p \upharpoonright i_n)| < l$. In other words, the length of the output between
successive stages in either path grows by at most $l$, and, in any stage, the amount by
which the length of $q$'s output exceeds the length of $p$'s output at that stage is less than
$l$. These bounds follow from the definition of $i_n$ and $j_n$.

For all $n \ge 0$, let $p_n$ be the final state of $p \upharpoonright i_n$, let $q_n$ be the final state of $q \upharpoonright j_n$, and
let $u_n \in \Sigma^{<l}$ be the string such that $\nu(q \upharpoonright j_n) = \nu(p \upharpoonright i_n)u_n$, the "extra extension" of the
output of path $q$ at stage $n$. Note that each triple of the form $(p_n, q_n, u_n)$ is an element
of the finite set $Q \times Q \times \Sigma^{<l}$, of cardinality less than $|Q|^2|\Sigma|^l$. As noted earlier, since
there are no $\lambda$-cycles in $p$ or $q$, the length of any path is at most $|Q|$ times the length
of its output. Because $|p|, |q| > l|Q|^3|\Sigma|^l$, it follows that $|\nu(p)|, |\nu(q)| > l|Q|^2|\Sigma|^l$. Since
the length of either output increases by at most $l$ with each stage, there are more than
$|Q|^2|\Sigma|^l$ stages. By the pigeonhole principle, at least one triple $(p_i, q_i, u_i) \in Q \times Q \times \Sigma^{<l}$
must appear twice in the stage-by-stage enumeration $(p_0, q_0, u_0), (p_1, q_1, u_1), \ldots$.

Let $0 < i < j$ represent two different stages such that $(p_i, q_i, u_i) = (p_j, q_j, u_j)$. Then
$c_p = (p_i, \ldots, p_j)$ and $c_q = (q_i, \ldots, q_j)$ each represent a cycle in $p$ and $q$, respectively, of the
same output length. While these cycles do not have the same output, $\nu(c_p)$ is a "shifted"
version of $\nu(c_q)$: $\nu(c_p)u_i = u_i\nu(c_q)$. Therefore, removing $c_p$ from $p$ and $c_q$ from $q$ will
create two different subpaths $p' \preceq p$ and $q' \preceq q$ such that $\nu(p') = \nu(q')$. Since $i > 0$,
$p' \upharpoonright 1 = p \upharpoonright 1 \ne q \upharpoonright 1 = q' \upharpoonright 1$, therefore $p' \ne q'$. Because $\nu(p') = \nu(q')$, $(p', q')$ is a bad
pair, whence $(p, q)$ is not simple. □

The following theorem is the main theorem of this paper. It establishes that, unlike 
the trivial case of compression, up to a constant change in the size of the FST's, lossy 
FST's cannot achieve better decompression than ILFST's. 
Figure 3.2: Part of a lossy FST $T$ before and after the procedure to eliminate one simple
bad pair. Transition arrows are labeled incompletely for readability, and most output strings are
not shown; the full formal description of the transformation is given in the text. Figure 3.2(a)
illustrates that there exist two different paths from some vertex $s$ to some vertex $f$ that produce
the same output. Here, $m \ge n$. States $pp_1$ and $ppp_1$ are other successor states of $p_1$ (besides
$p_2$), and likewise with the states $pp_2, ppp_2$, etc. Figure 3.2(b) shows that, to prevent both paths
from reaching $f$ with the same output, the upper path is "cloned" by creating clones of the
states (indicated with a prime) comprising the upper path, and sending $T$ along this new path
instead, if the symbol $a$ is read. The new states are shown surrounded by dashed lines. The new
path completely duplicates the behavior of the old path (because each cloned state also clones
the outgoing transition arrows, including the output strings), unless the second-to-last state of
the path, $p'_m$, is reached. In this case, instead of going to state $f$, $T$ goes to state $p'_{end}$ (and
outputs a string $x'$ possibly different from $x$), all of whose transition arrows self-loop. Since the
set $X = \{ x_a \mid a \in \Sigma \} \cup \{x'\}$ of outputs on the transition arrows into $p'_{end}$ is suffix-free, $p'_{end}$ cannot be the
end state of a simple bad pair. Intuitively, the states $p'_1, \ldots, p'_m$ behave exactly like $p_1, \ldots, p_m$,
but they remember that they were reached via $s$, and they prevent $T$ from entering state $f$ at
the end of the path and thereby losing information.
Theorem 3.10. For all $k \in \mathbb{N}$, there exists $k' \in \mathbb{N}$ such that, for all $x \in \Sigma^*$,

\[ D^{k'}_{\text{ILFS}}(x) \le D^{k}_{\text{FS}}(x). \]

Proof. Let $k \in \mathbb{N}$, and let $T = (Q, \delta, \nu, q_0) \in \text{FST}^{\le k}$ be a lossy FST. We construct an
ILFST $T'$ such that, for every minimal program $\pi \in \Sigma^*$ for $T$, there is a program $\pi' \in \Sigma^{|\pi|}$
such that $T(\pi) = T'(\pi')$. In other words, for all $x \in \Sigma^*$, the shortest program for $T'$ that
outputs $x$ is no larger than the shortest program for $T$ that outputs $x$. Since $|\text{FST}^{\le k}|$ is
finite, this establishes the theorem with $k' = \max_{T \in \text{FST}^{\le k}} |T'|$.

We proceed as follows. By Lemma 3.7, if $T$ is not IL, then it has one or more simple
bad pairs. By Lemma 3.9, it has a finite number of these. The construction of $T'$ from
T will simply eliminate these simple bad pairs one by one, while ensuring that, for each 
n G N such that there is a minimal program of length n for a string x, at least one program 
of length n for x remains. Of course, even though there are a finite number of simple bad 
pairs, it may be the case that the procedure to eliminate one simple bad pair introduces 
others. At the conclusion of the proof we demonstrate how to account for this. 

Let $s, f \in Q$, $x \in \Sigma^*$, and let $(p, q)$ be a simple bad pair for $(s, f)$ with output $x$.
Figure 3.2 shows the part of $T$ relevant to the simple bad pair $(p, q)$, and it illustrates the
procedure to eliminate this simple bad pair, which we now describe formally.

Write $p = (s, a, p_1, a_1, p_2, a_2, \ldots, p_m, a_m, f)$ and $q = (s, b, q_1, b_1, q_2, b_2, \ldots, q_n, b_n, f)$.
Since $(p, q)$ is simple, $a \ne b$ and either $p_m \ne q_n$ or $a_m \ne b_n$ (i.e., $p$ and $q$ have different
first and last transition arrows).

Assume without loss of generality that $m \ge n$, i.e., that $|p| \ge |q|$. Then, if $m > n$, no
minimal program will ever (completely) traverse the path p, since any program traversing 
p can be converted to a smaller program, producing the same output, by traversing q 
instead. If m = n, then it may be the case that a minimal program traverses p. However, 
any program that traverses p can be converted into a program of the same length that 
traverses q instead. Hence, if there is a minimal program that traverses p, then there 
is another minimal program producing the same output that never traverses p. We will 
remove p from T in such a way that T's output will remain unaltered on any program 
that never traverses p. 

To remove the bad pair $(p, q)$, we alter $T$'s state set and transition and output functions
in the following way. Add the states $p'_1, p'_2, \ldots, p'_m, p'_{end}$ to $Q$. Note that even if the path $p$
contains cycles and so has fewer than $m$ unique states, we add exactly $m + 1$ unique new
states to $Q$ (i.e., we "unroll" any cycles in $p$). Choose a suffix-free set $A \subseteq \Sigma^*$ such that
$|A| = |\Sigma| + 1$. Assign to each $a \in \Sigma$ a unique element $x_a \in A$, and let $x' \in A$ denote
the remaining element of $A$ not assigned to any $a \in \Sigma$. Alter the transition and output
functions from $\delta$ and $\nu$ to $\delta'$ and $\nu'$, respectively, as follows.

1. Let $\delta'(s, a) = p'_1$.

2. For all $1 \le i < m$, let $\delta'(p'_i, a_i) = p'_{i+1}$ and $\nu'(p'_i, a_i) = \nu(p_i, a_i)$.

3. For all $1 \le i \le m$ and $a \in \Sigma - \{a_i\}$, let $\delta'(p'_i, a) = \delta(p_i, a)$ and $\nu'(p'_i, a) = \nu(p_i, a)$.

4. Let $\delta'(p'_m, a_m) = p'_{end}$ and $\nu'(p'_m, a_m) = x'$.

5. For all $a \in \Sigma$, let $\delta'(p'_{end}, a) = p'_{end}$ and $\nu'(p'_{end}, a) = x_a$.

Let $p' = (s, a, p'_1, a_1, p'_2, a_2, \ldots, p'_m, a_m, p'_{end})$ denote the new path taken by strings that
would have traversed the path $p$.

Let $T \setminus p$ denote the FST obtained by altering $T$ in this way. It is clear that $T(\pi) =
(T \setminus p)(\pi)$ for any program $\pi \in \Sigma^*$ that, when given as input to $T$, never causes $T$ to traverse
$p$. Since every output string $x$ of $T$ has at least one minimal program for $T$ that never
traverses $p$, this alteration does not increase the complexity of any string relative to $T \setminus p$.
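
The alteration of $T$ described above (steps 1 to 5) can be sketched in Python. This is only an illustration under stated assumptions, not the paper's implementation: the FST is assumed to be stored as two dictionaries keyed by (state, symbol), the cloned states are given the hypothetical names `("clone", i)` and `("clone", "end")`, the suffix-free set is taken to be a set of equal-length strings (one easy way to obtain suffix-freeness), and the three-state example transducer is made up for the demonstration.

```python
from itertools import product

def is_suffix_free(words):
    """True if no word is a suffix of a different word in the collection."""
    ws = list(words)
    return not any(u != v and v.endswith(u) for u in ws for v in ws)

def suffix_free_set(alphabet, size):
    """Return `size` distinct strings of equal length (hence suffix-free)."""
    if len(alphabet) < 2:
        raise ValueError("need at least two symbols")
    length = 1
    while len(alphabet) ** length < size:
        length += 1
    words = ["".join(w) for w in product(sorted(alphabet), repeat=length)]
    return words[:size]

def run(delta, nu, start, program):
    """Output of the FST (delta, nu) started in `start` on input `program`."""
    state, out = start, []
    for c in program:
        out.append(nu[(state, c)])
        state = delta[(state, c)]
    return "".join(out)

def remove_path(delta, nu, alphabet, s, a, path):
    """Apply steps 1-5 to the path (s, a, p_1, a_1, ..., p_m, a_m, f).

    `path` is the list [(p_1, a_1), ..., (p_m, a_m)]. State names of the
    form ("clone", ...) are assumed not to occur in the original FST.
    """
    m = len(path)
    A = suffix_free_set(alphabet, len(alphabet) + 1)
    x = dict(zip(sorted(alphabet), A))        # x_a for each symbol a
    x_prime = A[-1]                           # the remaining element x'
    clone = [("clone", i) for i in range(1, m + 1)]
    end = ("clone", "end")
    d, v = dict(delta), dict(nu)
    d[(s, a)] = clone[0]                      # step 1 (output on (s, a) unchanged)
    for i, (p_i, a_i) in enumerate(path):
        last = (i == m - 1)
        for c in alphabet:
            if c == a_i:                      # steps 2 and 4
                d[(clone[i], c)] = end if last else clone[i + 1]
                v[(clone[i], c)] = x_prime if last else nu[(p_i, c)]
            else:                             # step 3
                d[(clone[i], c)] = delta[(p_i, c)]
                v[(clone[i], c)] = nu[(p_i, c)]
    for c in alphabet:                        # step 5
        d[(end, c)] = end
        v[(end, c)] = x[c]
    return d, v

# Hypothetical example: programs "00" and "1" both lead from s to f with output "1".
delta = {("s", "0"): "u", ("s", "1"): "f", ("u", "0"): "f",
         ("u", "1"): "u", ("f", "0"): "f", ("f", "1"): "f"}
nu = {("s", "0"): "", ("s", "1"): "1", ("u", "0"): "1",
      ("u", "1"): "0", ("f", "0"): "0", ("f", "1"): "1"}
new_delta, new_nu = remove_path(delta, nu, "01", "s", "0", [("u", "0")])
```

In this example the longer path is $(s, 0, u, 0, f)$ and the shorter one is $(s, 1, f)$. Programs that never traverse the longer path, such as `"110"` or `"010"`, produce the same output before and after the alteration, while a program beginning with `"00"` is diverted into the cloned end state.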

Recall that there are a finite number of simple bad pairs in T. We now demonstrate 
that repeated application of this procedure to T will eventually rid T of all simple bad 
pairs, even if new simple bad pairs are introduced by the procedure itself. Consider how 
new simple bad pairs may be introduced by the procedure just described. Since the only 
existing transition arrow that moved was $(s, a, p_1)$ (to $(s, a, p'_1)$), and since this is the only
transition arrow into any state of the path $p'$ from outside of $p'$ (see the dashed lines in
Figure 3.2), for any new simple bad pair $(r, t)$, either (1) $(r, t)$ must have one of its paths
traverse this transition arrow, or (2) $(r, t)$ must originate from one of the "cloned" states
$p'_1, \ldots, p'_m$.

(1) Suppose that a new simple bad pair $(r, t)$ has a path (assume it is $r$) that traverses
the transition arrow $(s, a, p'_1)$. Then, either (1a) $r$ continues all the way along the
path $p'$, (1b) $r$ terminates on $p'_i$ for some $i$, or (1c) $r$ leaves $p'$ before reaching $p'_{\mathrm{end}}$.

(a) If $r$ continues all the way to $p'_{\mathrm{end}}$, then, because the set of outputs on the transition
arrows into $p'_{\mathrm{end}}$ is a suffix-free set, by Lemma 3.5, $p'_{\mathrm{end}}$ cannot be the end state
of a simple bad pair, so $(r, t)$ is not a simple bad pair.

(b) If $r$ terminates on $p'_i$ for some $i$, then note that $p'_i$ has only one incoming transition arrow,
and any singleton set is trivially suffix-free. By Lemma 3.8, $p'_i$ is not the end state
of a simple bad pair, so $(r, t)$ is not a simple bad pair.

(c) If $r$ leaves $p'$ before reaching $p'_{\mathrm{end}}$, then, since the path $p'$ otherwise replicates the
behavior of $p$, this will result in a new simple bad pair that simply replaces an
old simple bad pair that was destroyed when the transition arrow $(s, a, p_1)$ was
removed. Therefore, no paths of net new simple bad pairs traverse the transition
arrow $(s, a, p'_1)$.

(2) Suppose that a new simple bad pair $(r', t')$, where $r' = (p'_i, a_r, \ldots)$ and $t' = (p'_i, a_t, \ldots)$,
originates from one of the cloned states $p'_1, \ldots, p'_m$. (1a) and (1b) tell us that no simple
bad pair can end in any state along $p'$. Thus, $(r', t')$ is "equivalent" to an existing
simple bad pair $(r, t)$, where $r = (p_i, a_r, \ldots)$ and $t = (p_i, a_t, \ldots)$, in the sense that their
paths traverse the same states, except for the fact that one of $r'$ (resp. $t'$) has an
initial segment that traverses the cloned states $p'_i, p'_{i+1}, \ldots$ instead of $p_i, p_{i+1}, \ldots$ for a
time before leaving $p'$ and "rejoining" with $r$ (resp. $t$). Given two simple bad pairs,
place them in the same equivalence class if they were initially the same simple bad
pair, but are now different because their first state was cloned. While $(r, t)$ and $(r', t')$
are not the same simple bad pair, they are in the same equivalence class. When the
simple bad pair $(r, t)$ is fixed by the procedure, the transition arrow $(p_i, a_r, r_1)$ will be
redirected to $(p_i, a_r, r''_1)$, where $r''_1$ is the first state in a new path $r''$ introduced into $T$
($r''_1$ plays the same role as $p'_1$ did in the bad pair $(p, q)$). To simultaneously fix $(r', t')$,
we redirect the transition arrow $(p'_i, a_r, r'_1)$ to $(p'_i, a_r, r''_1)$ as well; in other words, if $T$
is in state $p'_i$ and reads $a_r$, send $T$ along the same new path $r''$ to which it would be
redirected if it were in state $p_i$. Since $r''$ ensures the path $r$ does not lose information,
$r''$ will do so for $r'$ as well. Of course, it is possible for the procedure to make a clone
of a clone, which would admit more than 2 simple bad pairs to the same equivalence
class; this case is handled in the obvious way, where all of the longer paths of each
simple bad pair in the class would be redirected to the same newly created path in
one step.

Since we have altered the procedure to fix all simple bad pairs in an equivalence class in 
one step, it is clear that the number of equivalence classes will decrease by one each time 
the procedure is applied. Since each equivalence class described in (2) will correspond
to one simple bad pair that was present in the initial FST T, by iteratively applying 
this procedure to each equivalence class of simple bad pairs, all simple bad pairs will be 
eliminated in a finite number of steps. □ 

Theorem 3.10 implies a new characterization of the finite-state dimension of individual
sequences in terms of decompression by (possibly lossy) finite-state transducers. 

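Theorem 3.11. For all $S \in \Sigma^\infty$,

\[
\dim_{\mathrm{FS}}(S) = \lim_{k \to \infty} \liminf_{n \to \infty} \frac{D^k_{\mathrm{FS}}(S \upharpoonright n)}{n}
\]

and

\[
\mathrm{Dim}_{\mathrm{FS}}(S) = \lim_{k \to \infty} \limsup_{n \to \infty} \frac{D^k_{\mathrm{FS}}(S \upharpoonright n)}{n}.
\]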

Proof. Since every ILFST is a FST, for all $k \in \mathbb{N}$ and $x \in \Sigma^*$, $D^k_{\mathrm{ILFS}}(x) \ge D^k_{\mathrm{FS}}(x)$. The
theorem follows by Theorem 3.10 and Lemma 3.4. □

Finally, we note that an analog of the definition of finite-state dimension given in §2 
holds for lossy decompressors as well. Given a sequence $R \in \Sigma^\infty$ and a FST $T$, define
$T(R)$ to be the output of $T$ on $R$, the shortest element $S \in \Sigma^\infty \cup \Sigma^*$ such that, for all
$n \in \mathbb{N}$, $T(R \upharpoonright n) \sqsubseteq S$.
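Theorem 3.12. For all $S \in \Sigma^\infty$,

\[
\dim_{\mathrm{FS}}(S) = \inf_{\substack{T \in \mathrm{FST},\, R \in \Sigma^\infty \\ T(R) = S}} \liminf_{m \to \infty} \frac{m}{|T(R \upharpoonright m)|}
\]

and

\[
\mathrm{Dim}_{\mathrm{FS}}(S) = \inf_{\substack{T \in \mathrm{FST},\, R \in \Sigma^\infty \\ T(R) = S}} \limsup_{m \to \infty} \frac{m}{|T(R \upharpoonright m)|}.
\]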



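Proof. We show the result for $\dim_{\mathrm{FS}}$; the proof for $\mathrm{Dim}_{\mathrm{FS}}$ is analogous. By Theorem 3.3,

\[
\begin{aligned}
\dim_{\mathrm{FS}}(S)
&= \inf_{T \in \mathrm{ILFST}} \liminf_{n \to \infty} \frac{|T(S \upharpoonright n)|}{n} \\
&= \inf_{\substack{T \in \mathrm{ILFST},\, R \in \Sigma^\infty \\ T(R) = S}} \liminf_{m \to \infty} \frac{m}{|T(R \upharpoonright m)|} \\
&\ge \inf_{\substack{T \in \mathrm{FST},\, R \in \Sigma^\infty \\ T(R) = S}} \liminf_{m \to \infty} \frac{m}{|T(R \upharpoonright m)|} \\
&\ge \lim_{k \to \infty} \liminf_{n \to \infty} \frac{D^k_{\mathrm{FS}}(S \upharpoonright n)}{n} \\
&= \dim_{\mathrm{FS}}(S),
\end{aligned}
\]
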

because allowing the FSTs to be lossy, and allowing a different FST of length $\le k$ for
each prefix of $S$, cannot increase the complexity of $S$. □
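
The ratio $m / |T(R \upharpoonright m)|$ appearing in Theorem 3.12 can be made concrete with a small Python sketch. It is an illustration only: the one-state transducer below, which outputs every input symbol twice, is a hypothetical example, the dictionary representation of the FST is an assumption, and only a finite prefix of $R$ is examined, so the code shows the quantities under the limit rather than the limit or the infimum over all FSTs.

```python
from fractions import Fraction

def run(delta, nu, start, word):
    """Output of the FST (delta, nu) started in `start` on input `word`."""
    state, out = start, []
    for c in word:
        out.append(nu[(state, c)])
        state = delta[(state, c)]
    return "".join(out)

def decompression_ratios(delta, nu, start, r_prefix):
    """Return m / |T(R|m)| for each m = 1..len(r_prefix) with non-empty output."""
    ratios = []
    for m in range(1, len(r_prefix) + 1):
        produced = len(run(delta, nu, start, r_prefix[:m]))
        if produced > 0:
            ratios.append(Fraction(m, produced))
    return ratios

# Hypothetical one-state decompressor: each input symbol is output twice.
dbl_delta = {("q", "0"): "q", ("q", "1"): "q"}
dbl_nu = {("q", "0"): "00", ("q", "1"): "11"}

ratios = decompression_ratios(dbl_delta, dbl_nu, "q", "0110")
```

For this transducer every prefix of length $m$ yields $2m$ output symbols, so each ratio is $1/2$, consistent with any sequence of the form $T(R)$ for this $T$ having finite-state dimension at most $1/2$.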